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Abstract 


Gas production from the Groningen Field located in the north of 
the Netherlands induces earthquakes of sufficiënt number and mag¬ 
nitude to cause concern about the level of seismic risk. An essential 
part of assessing this risk is a probabilistic seismological model to fore- 
cast the future probabilities of earthquakes induced under future gas 
production plans. Here we present an alternative seismological model 
that provides a more precise description of the relationship between 
observed seismicity and reservoir compaction. 

This has been achieved using a Poisson Point Process model to 
describe the nucleation rate of earthquakes in response to reservoir 
compaction and the Epidemie Type Aftershock Sequence model to 
describe the triggering of additional events. Joint maximum likelihood 
parameter estimates were obtained and Monte Carlo analysis was used 
to quantify uncertainties in these estimates. Simulations of seismicity 
using these parameter estimates are in good agreement with all aspects 
of past seismeity. 

The model achieves more reliable parameter estimates and more 
precise forecasts than the existing Strain Partitioning model. Fur- 
thermore, it provides a complete description of aftershocks that criti- 
cally influence the year-to-year variability in seismicity and therefore 
must be properly included in the Probabilistic Seismic Risk Analysis. 
This activity rate model provides the most complete framework yet 
for forecasting future seismicity under different plans for future gas 
production. 
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1 Introduction 


Gas production from the Groningen Field induces seismicity that in recent 
years has started to cause concerns about the future strength of earthquake 
ground motions and the resilience of existing buildings to these ground mo- 
tions. Ongoing gas production is depleting the pressure of hydrocarbon gas 
within the reservoir pore space causing the reservoir to compact. In turn, 
reservoir compaction increases the mechanical loads acting on pre-existing 
geological faults within and close to the reservoir. Some small fraction of 
these faults become unstable and are therefore prone to slip. Abrupt slip on 
such a fault results in an earthquake that radiates seismic energy. 

The established method for assessing the future likelihood of ground mo- 
tion events is Probabilistic Seismic Hazard Assessment (PSHA) (e.g. Cornell, 



Figure 1: (a) Location map of earthquakes in relation to the field bound- 
ary, faults mapped at the top Rotliegendes and reservoir compaction up to 
2013 according to the time-decay compaction model. Circle area denotes the 
magnitude as indicated by the legend. Map coordinates are given as kilome- 
tres within the Dutch National coordinate system (Rijksdriehoek), (b) The 
probability density function of epicentres estimated from the M > 1.5 events 
observed between April 1995 and August 2014 using the method of Gaussian 
kerneis. The kernei bandwidth was estimated according Scott’s rule (Scott, 
1992) and the results are expressed as an event number density. 
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1968; McGuire, 2008). The two essential elements of PSHA are a seismolog- 
ical model to describe the probability distribution of possible future earth- 
quake locations and magnitudes, and a ground motion model to describe the 
probability distribution of ground motions, such as peak ground accelera- 
tions, at a given distance from an earthquake of a given magnitude. The 
convolution of these two elements yields an estimate for the probability dis¬ 
tribution of future ground motion events at a site of interest. Likewise, the 
established method for assessing the future likelihood of building damage or 
injury due to earthquake ground motions is Probabilistic Seismic Risk As- 
sessment (PSRA). This is an extension of PSHA that further convolves the 
seismic hazard models with the probability of building damage for a given 
ground motion event and the probability of injury for a given building dam¬ 
age event. 

PSHA and PSRA for the Groningen Field both require a reliable, site- 
specific, model for seismicity induced by reservoir compaction. There is fun- 
damental physical reason to look for a relationship between induced seismic¬ 
ity and induced strain. An earthquake constitutes an increment of slip of 
one side of a fault surface relative to the other. This discontinuity in dis¬ 
placement across the fault surface represents an increment of strain. This is 
measured by the seismic moment of the event defined as 

M 0 = vDS, (1) 

where D is the average slip on the fault, S is the area of the fault slip, and /j is 
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Figure 2: Time series of M > 1.5 earthquake magnitudes versus reservoir 
compaction at the origin time and epicentre of each event. 
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the shear modulus of the surrounding medium. Consequently, a population 
of earthquakes within some spatial volume and time interval represent an 
average incremental strain due to seismic slip on faults (seismic strain) as 
described by Kostrov (1974) 
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where N is the number of events within the volume, V is the volume, M° 
and are the seismic moment and unit moment tensor of the k th event 
respectively. 

Similarly, geodetic measurements of surface displacements provide a mea- 
sure of the induced strain due to gas production in the form reservoir com- 
paction. Clearly, there must be a relationship induced strain and induced 
earthquakes. In the extreme case that all induced strain is accommodated by 
earthquakes the geodetic strain and seismic strains must be equal. Otherwise 
seismic strains must be smaller than the induced strains. In this case, other 
mechanisms accommodate the induced strain such as elastic and aseismic 
plastic deformations. The former stores elastic energy which remains avail- 
able for release by future earthquakes, the latter dissipates energy without 
any associated seismicity. The presence of pre-existing faults and other geo- 
logical heterogeneities means the relative contribution of seismic strain to the 
total induced strain may vary from place to place. However, these possible 
effects may not be visible if the controlling heterogeneities are sufficiently 
small-scale and densely distributed. 

Motivated by these physical considerations about induced earthquakes 
conforming in some manner to the induced strain field we start by investi- 
gating the relationship between induced seismicity and reservoir compaction. 
To that end we note that the distribution of observed earthquake epicentres 
conforms to the areal distribution of reservoir compaction inferred from the 
observed distribution of surface subsidence (Figure 1). Likewise the observed 
temporal distribution of earthquakes conforms to the development of reser¬ 
voir compaction through time (Figure 2) in the sense that earthquakes ex- 
hibit a sustained preference for regions of higher compaction through time. 
This suggests nucleation of earthquakes depends on compaction, but does 
not suggest a casual link between compaction and magnitude. If magni¬ 
tudes are independent of compaction, the larger magnitudes are still more 
likely to occur in regions where there are more events. The relationship 
with mapped faults is however rather uncertain at present because the ran- 
dom measurement errors in epicentral locations are large relative to typical 
distances between mapped faults (Figure 3). 
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Figure 3: The relationship between observed earthquake epicentres and faults 
mapped close to the top of the reservoir at the top Rotliegendes. Epicentre 
location errors are denoted by 95% and 67% confidence intervals based on a 
Standard horizontal location error of 500 m. 

One possible seismological model developed for the Groningen Field, 
called the strain partitioning (SP) model, describes the fraction of reser¬ 
voir strain that is released by earthquakes (Bourne et ah, 2014b) leading 
to a PSHA (Bourne et ah, 2014a). This model is based on an empirical 
stochastic relationship between strain partitioning and reservoir compaction. 
Kostrov (1974) and McGarr (1976) provide the underlying justification by 
demonstrating the fundamental connection between the total seismic moment 
released by a population of earthquakes and the average strain due to those 
earthquakes. This model is estimated from the reservoir compaction and the 
seismic moments of earthquakes observed since 1995. Given the importance 
of the seismological model for assessing seismic hazard and risk, we now seek 
to develop an alternative model based on an empirical stochastic relationship 
between the rate of earthquake nucleation and reservoir compaction. Simi- 
lar approaches have been used to describe injection-induced seismicity (e.g. 
Shapiro et ah, 2010, 2013; Mena et al., 2013). This activity rate (AR) model 
offers a different approach to model parameter estimation and also includes 
a description of the tendency of events to cluster around previous events as 
aftershocks. 

To begin, we will review the Standard formulation of a Poisson point pro- 
cess (Section 2) and then examine a particular instance of this model that 
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incorporates a linear relationship between reservoir compaction and the nu- 
cleation rate of earthquakes (Section 2.2). Deficiencies in this initial simple 
model motivate a generalisation of the model to incorporate an exponential 
compaction trend (Section 2.3) to achieve a history match of at least equiva¬ 
lent quality to the SP model. After recognising slight yet significant evidence 
for event clustering, we will describe an extension of the AR model to include 
an Epidemie Type Aftershock Sequence (ETAS) model (Section 4). Finally 
we review the evidence for earthquake nucleation clustering around reservoir 
compaction and previous earthquakes to determine if there is some residual 
effects that might be related to earthquakes clustering around pre-existing 
faults mapped close to the reservoir (Section 5). 
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2 The Poisson point process model 


Following the Standard model for point processes (e.g. Davison, 2003), sup- 
pose we observe events distributed on the time interval [0, t 0 ] and let N(w, w+ 
t ) denote the number of events observed within the sub-interval (w,w + t). 
If events within disjoint subsets are independent then it follows that the 
probability of no events within the sub-interval is 


Pr{N(w , w + t) = 0} = exp 


pw-\-t 


X(u)du ) , 


(3) 


where A (u) is the intensity function. Since the amount of time, T, from the 
event at w to the next event exceeds t if and only if N(w, w + t) = 0, it 
follows that the inter-event time, T, is a random variable with a probability 
density function obtained by differentiating 3 to obtain 

, dPv{N{w,w + t)=0} . . , . ( f w+t . , \ 

h =-^-= K w + t ) exp ( - J X(u)du\ . (4) 

The joint probability density of n independent events observed at times 
ti,... ,t n is the product of their respective probability density functions, 


L = exp 


fto 


X(u)du i X(ti). 


2=1 


(5) 


This is also the likelihood expression useful for estimating model parameters 
from the data. The probability of n events anywhere within the interval 
[0, to] is then the integral of this product with respect to t \,..., t n and leads 
to the result 


Pr (jV(0,£ o ) = n) 


A(t 0 ) n 

n\ 


exp(—A(t 0 )), 


( 6 ) 


where A(t 0 ) = f*° X(u)du. The number of events, N, is therefore a Poisson 
variable with mean A(t 0 ). 

The conditional probability density, p, that events occur at t\,, t n con- 
ditional on there being n events within the interval [0, to] is (5) divided by 
(6), i.e. 


p — n\ J 


X(U) 


where 0 < ti < • • • < t n < to 


tl A (*°) 

Using Sterling’s approximation, logfn!) 


(7) 


n log n — n, this may be computed as 


n 

logp = nlogn — n + ^^log A(tj) — nlog A(t 0 ), (8) 

2=1 


8 





with negligible error for n > 100. 

This Poisson process model in one dimension also extends to several di- 
mensions. Consider the case that A = A(x, t) such that event locations x are 
within the bounded region S and occurrence times t are within the interval 
[0,to]- The joint probability density of n independent events observed at 
locations xi,...,x n and times ti,... ,t n follows from (5) as, 

L = exp J A(x, tjd^dt^ JjA(xj,tj). (9) 


2.1 A homogeneous Poisson point process 

In the case of a constant intensity function, A(x, t) = A, the log likelihood 
for n events follows from (9) as 

£(A) = —At 0 X + n log A, (10) 


where A is the area of S. Note that in this example the intensity function 
has units (e.g. km -2 days -1 ), so to ensure the equation is dimensionally 
correct the log term should be interpreted as log(A/A 0 ), where A 0 is the unit 
intensity (e.g. 1 km -2 days -1 ). The maximum likelihood estimate for A is 
then found from the derivative of this expression with respect to A set equal 
to zero. This yields: 


A 


n 

Ata 


( 11 ) 


This maximum likelihood estimator is simply the average event rate per unit 
area. The conditional probability density of observing events at locations 
Xi,... ,x n and times ti,...,t n conditional on there being n events follows 
from (7) as 


n\ 

(a tóp 


( 12 ) 


2.2 A compaction trend Poisson point process 

Let us now consider the case that events preferentially occur in regions of 
greater absolute reservoir volume change. This suggests an intensity function 
of the form A = ac, where c is the rate of reservoir compaction with time and 
a is a constant that describes the number of events per unit reservoir volume 
change. The likelihood of n events arising at locations xi,..., x n and times 
> •• • j t n is 

n 

L = exp (-aAVVo)) JJöc(xj,L), (13) 

2=1 
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where AV(to) is the absolute reservoir volume change at time to- The log 
likelihood is 

n 

£ = -aAV(t 0 ) + ra log q + ^ log c(xj,tj). (14) 

2=1 

This means the maximum likelihood estimate for a is 

„ Tl 

a= AV{tö) 

From (7), and after generalising this expression from one dimension in time 
to three dimensions in time and space by including an integral over area, the 
conditional probability density of observing events at locations xi,..., x n 
and times t\,... ,t n given there are ra events is 


(15) 


n! Il 


AV(t 0 )' 


(16) 


This does not depend on a since it only influences the number of events and 
not their relative distribution in space and time. 

The relative likelihood of the compaction trend model with respect to the 
homogeneous model is simply the ratio of (16) and (12). The corresponding 
relative log likelihood is 


n 

Y, log c(x. t , tj) - ra log AV(t 0 ) + nlogAt 0 . (17) 

2=1 

Recognising that A V(to) = (c) a At (h where (c) a is the arithmetic average rate 
of reservoir thickness change over the spatial interval S and the time interval 
[0, 2 q]> this expression simplifies to 


n 

y log c(xj, U) - n log(c) 0 , (18) 

2=1 

The relative likelihood is therefore: 


<4A 

(C)a) 

where (c) g is the geometrie average rate of reservoir compaction at the time 
and location of each event. This means the relative likelihood simply depends 
on the ratio of the geometrie mean compaction rate for the events and the 
arithmetic mean compaction rate for the reservoir. 
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Based on the observed distribution of M > 1.5 earthquake epicentres 
and reservoir compaction between 1995 and 2013 in the Groningen Field 
(Figure 1), application of (19) yields a relative likelihood of 7 x 10 14 . This 
suggests epicentres are significantly more likely to be located in regions of 
greater compaction than could be expected simply due to chance that would 
distribute events with equal likelihood anywhere within the reservoir. Fig¬ 
ure 4 shows the cumulative number of events, n as a function of reservoir 
volume decrease, AV. According to (15), the data should plot with a con¬ 
stant slope equal to a, however the slope clearly increases with AV. Based on 
the Kolmogorov-Smirnov test statistic the compaction-trend Poisson Point 
Process model may be rejected within greater than 99% confidence. Figure 5 
shows the map distribution of local estimates for the number of events per 
unit volume change. These were obtained as the ratio of the event density 
map (Figure lb) to the reservoir compaction map (Figure la). This again 
indicates the larger intensities cluster around the larger values of reservoir 
compaction. 

Furthermore, estimates for a within disjoint intervals of reservoir com¬ 
paction (Figure 6) show an approximate exponential increase in a with re¬ 
spect to compaction. This suggests an alternative Poisson Point Process 
model based on an exponential compaction trend. Given the limited range 
in observed compaction values, alternative parametrizations, such as an in¬ 
verse power-law, are also similarly consistent with the data. As the current 
data do not allow us to distinguish between these alternatives, we choose for 
now to proceed with the exponential trend parametrization. 
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Figure 4: The observed cumulative distribution of events with reservoir vol¬ 
ume decrease relative to a homogeneous Poisson process. The model bounds 
denote the 0.95 and 0.99 quantiles of the Kolmogorov-Smirnov test statistic. 
Tick marks denote each individual event. 
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Figure 5: The map distribution of maximum likelihood estimates for the 
activity rate, a, suggests a trend with reservoir compaction. 
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Figure 6: (a) The number of M > 1.5 events per unit reservoir volume 
decrease increases with an approximate exponential trend relative to reser¬ 
voir compaction. (b) The exponential-like trend of strain partitioning with 
compaction is subject to considerably more variability. 
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2.3 An exponential compaction trend Poisson process 

Motivated by the observed trend in Figure 6, let us now consider an exponen¬ 
tial trend Poisson point process model as a function of reservoir compaction, 
c(t), of the form: 

= fte* 6 , (20) 

where A(S, t ) is the expected number of events within the region S and the 
time period (0, t), A V(S,t) is the bulk reservoir volume decrease within the 
region S at time t, and 

c=jJcdS, (21) 

where A is the area of region S. This is a two-parameter model, where 
Po describes the background activity rate and Pi describes the exponential 
increase in activity rate with compaction. Recognising that AR = Ac, where 
A is the surface area of the reservoir subject to compaction, c, leads to 

A = Acp o e 0lC . (22) 

Now, we seek a Poisson intensity function of the form, A = A(x, t), such 
that, A (to) = f s f*° A(x, t'jdSdt and also for subsets of S selected to contain 
approximately constant compaction then the expected number of events, A, 
follows equation 22. This implies 

A(x, t) = P 0 c( 1 + Pic)e^ ic , (23) 



where c = c(x, t), c(x, 0) = 0, c = dc/dt, and so it follows that 

A (to)= [ /5 0 c(x,t 0 )e /3lc(x ’ to) dA. (24) 

Js 

Consider for a moment, the special case of a small subset of the reservoir 
where c is approximately constant. From (7) it follows that the probability 
density function for a single event within the time interval (0, to) is 


A (A = c(H-ftic) c /3i( c -cq) 
A(t 0 ) Ac 0 


(25) 


This indicates an exponential skew in the distribution of event origin times 
towards the end of the time interval within this part of the reservoir. Conse- 
quently events are more likely to occur towards the end of the time interval. 
A similar exponentially skewed distribution exists for the general case of 
c = c(x, t). 
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The log-likelihood of n independent events observed at locations xi,..., x n 
and occurrence times ti,...,t n within the region S and the time period (0, to) 
follows from (9) as, 

n 

£ = -A(t 0 ) + ^logA(x i ,t i ). (26) 

2=1 

Combining (26), (23) and (24) leads to 

£ = — f /3 0 c(x, t 0 )e /3lc(x ’ fo M,S' + nlog/3 0 

JS , X 

n n n (27) 

+ ^log(l +p i c(x i ,t i )) + Pl ^2c(Xi,ti) + J^logc(x;,t;) 

2=1 2=1 2=1 

If earthquake observations only started at time t s after some initial period of 
reservoir compaction (0 ,t s ) then the log-likelihood of n independent events 
observed at locations xi,...,x n and occurrence times ti,...,t n within the 
region S and the time period (t s ,t 0 ) is 


£ = - [ Po (c(x,t 0 )e /3ic(x ’ to) - c(x,t s y iC ^) dS 

Js 

n n n 28 

+n log + J^log(l + Ac(x t.ti)) + ft ^c(xi,(() + J^logc(x ( , U) 

2=1 2=1 2=1 

In either case, maximum likelihood estimates for the model parameters (/3 0 , Pi) 
may be obtained by maximising (27) with respect to Po and P\. In the gen- 
eral case of c = c(x, t), this requires numerical integration and optimisation 
methods. Notice, however, that these maximum likelihood estimates do not 
depend on the rate of compaction, c, as the last term in (27) is a constant. 

To evaluate the log-likelihood function, numerical integration of the com¬ 
paction model is required at two moments in time, at the start of the earth¬ 
quake observation ( t s ), and at the end of the observation period (to)- In 
addition, the value of compaction at the location and occurrence time of 
each event must be obtained. Typically the reservoir compaction model is 
evaluated on a discrete grid in space and time. This means that estimates 
for compaction at times t s , ti,..., t„, t 0 must be obtained by interpolation 
from the times available within the model, t For now, we choose to proceed 
using simple piece-wise linear interpolation in time. 


2.3.1 Maximum likelihood parameter estimates 

Maximum likelihood estimates for Po, Pi were obtained numerically using 
the Nelder and Mead (1965) simplex algorithm to minimise the negative 
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Relative likelihood 


Figure 7: Relative likelihood of the exponential compaction trend Poisson 
Point Process model explaining the observed M > 1.5 earthquake distri- 
bution in relation to the reservoir compaction model from April 1995 to 
September 2014. The maximum likelihood solution is /3 0 = 3.09 x 10 -8 and 
j3\ = 13.03. Variation in relative likelihood around this solution provides an 
estimate of the conhdence interval. 

log-likelihood expression given by (28). Using the catalogue of M > 1.5 
earthquakes observed between April 1995 and August 2014 and the Time- 
decay reservoir compaction model yields a maximum likelihood estimate of 
Po = 3.09 x 10 -8 and 3\ = 13.03. Values of relative likelihood evaluated 
within the vicinity of this solution indicate uncertainty in this estimate. As 
expected, due to the limited number of earthquakes and limited range of 
reservoir compaction, there is a clear trade-off between do and di • Nonethe- 
less, the possibility of no covariance between seismicity and compaction may 
be conhdently rejected given d 0 > 0. Consistent with the previous analysis, 
a simple linear relationship between seismicity and compaction also may be 
conhdently rejected given Pi > 0. 
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2.3.2 Extension to a Marked Poisson Point Process 

The number of observed earthquakes, N, of at least magnitude, M, within 
a given region and period of time follow the log-linear frequency-magnitude 
relationship known as the Gutenberg-Richter law (Gutenberg and Richter, 
1954): 

log 10 N(M) = a — bM. (29) 

The b parameter, known as the 5-value, describes the relative decline in abun- 
dance of larger earthquakes relative to smaller ones and is therefore central to 
estimating the probability of larger earthquakes from an observed population 
of earthquakes. For a monitoring system with a magnitude threshold for de- 
tection and location of M min (magnitude of completeness) the corresponding 
probability density is 

f(M\M > M min ) = fj' e - b '( M - M min) (30) 

where b' = b log 10. This distribution is however not physical as there is no 
finite upper bound to the distribution. Instead, a truncated distribution is 
preferred such as proposed by Cornell and Vanmarcke, 

, /3 (Al A^rnin) p P ( Ad max M m j n ) 

f(M\M > M min ) = D --- _ M -, (31) 

— g P\ lvl max lvl min) 

where /? = b'/d, d = 1.5, and M max is the maximum possible magnitude. 

Groningen seismicity provides no empirical evidence for a maximum mag¬ 
nitude that truncates the frequency-magnitude distribution (Figure 8). How¬ 
ever one must exist to represent the upper bound on the total amount of 
strain energy available. Instead M max may be estimated using geomechanical 
principles under the assumption that the induced seismicity only accommo- 
dates induced strain due to reservoir compaction (see Bourne et ah, 2014b). 
In this way, the total induced strain available at the end of gas production 
provides a clear physical upper bound to the maximum magnitude. Starting 
with the fundamental relationship between the average strain due to a pop¬ 
ulation of earthquakes and their total seismic moment (Kostrov, 1974), this 
leads to an upper bound on total seismic moment according to the absolute 
value of the ultimate bulk reservoir volume change, |AR|, such that 

M 0 = 2/i|AR|, (32) 

This expression is comparable to that obtained by McGarr (1976) for a range 
of specific particular deformation geometries associated with subsurface vol¬ 
ume changes. The maximum total seismic moment computed for the Gronin¬ 
gen Field is 7 x 10 18 Nm. The relationship between the magnitude, M, and 
the seismic moment, M a , takes the form (Hanks and Kanamori, 1979). 
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log 10 M a = c + dM , (33) 

where typically c = 9.1 and d = 1.5. Hence the maximum total seismic 
moment corresponds to M max = 6.5 (Bourne et al., 2014b). 

An analytic expression for the maximum likelihood estimate of the 6-value 
given a catalogue of event magnitudes was derived by Aki (1965) and Utsu 
(1966). This follows from the log-likelihood as a function of the 6-value given 
the observed magnitudes Mi ,..., M n , as 

n 

l = n log b' - b'^Mi - M min ). (34) 

2=1 

Note that this excludes any contribution from M max which cannot be reliably 
estimated from the historie seismicity Bourne and Oates (2012) dne to the 
limited number of larger magnitude events. 

The maximum likelihood 6-value estimate is therefore 


ln 10 ((M)-M min ) 



Figure 8: The frequency-magnitude distribution of M > 1.5 events observed 
within the Groningen Field between April 1995 and September 2014 are con¬ 
sistent with a power-law distribution that is typically observed for most forms 
of natural and induced seismicity. The slope of this distribution, the 6-value, 
is consistent with the typical value of b = 1. Grey shading denotes the 95% 
confidence interval around this model. The dashed line shows a truncated 
frequency-magnitude distribution (see equation 31) with M max = 3.9. Given 
the confidence interval this improved fit is not statistically significant. 
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where (M) is the mean magnitude. If this Aki-Utsu equation is used in 
its original form the 6-values obtained give a noticeably different fit to the 
frequency-magnitude data than regression methods. This discrepancy is due 
to the finite width of the magnitude bins used, as pointed out by Marzocchi 
and Sandri (2003), among others, and may be corrected by replacing M min 
by M min — AM/2. If bins of width AM are used, the first magnitude bin 
contains events with magnitudes M min — AM/2 < M < M min + AM/2 
Marzocchi and Sandri (2003). 

2.3.3 A stochastic simulation procedure 

A simulation procedure is required to compute the number of events, and 
the location, origin time and magnitude for each individual event for each 
stochastic realisation of the earthquake catalogue within the period (t s ,to). 

The number of events is computed as a single sample from the Poisson 
distribution with a mean value equal to 

A(f 0 ) - A (t a ) =/3o [ c(x, f 0 )e /3lc(x ’ to) - c(x, Qe^^dS (36) 

Js 

The event locations are computed by sampling the expected event density 
function, iV(x), defined as the expected number of events per unit area, i.e. 


iV(x) = 0o (coe^ 1C0 - c s e^ Ca ) 


(37) 


where c s , c 0 denote the compaction at location x at times t a and t 0 respec- 
tively. 

The origin time of an event at a given location may be obtained by sam¬ 
pling the cumulative probability distribution, F(t), such that 

c t e^ lCt — c s e /3lCs 


m = 


(38) 


Co e^ lC ° — c s e$ lCs 

where c t , c s , c 0 denote the compaction at location x at times t, t s and t 0 
respectively. 

Earthquake magnitudes are simulated as independent samples from a 
truncated Pareto frequency-magnitude distribution 


F(M\M > M min ) = 


o /-") (Al Mmin) _ f fi(Mmax Mmin) 

| — f fi(Mmax M rnln ) 


(39) 


where ft = b'/d with 6=1, M min = 1.5, and M max = 6.5. This maximum 
magnitude is then adjusted after every simulated earthquake to ensure the 
total seismogenic strain cannot exceed the total reservoir strain, i.e. 


-Mmax dlog 10 (M o ,1 


M 


o,sim 


) 


(40) 
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where M o max = 7 x 10 18 Nm and M o mrn is the total seismic moment of all 
the previous earthquakes within the simulated earthquake catalogue. 

2.3.4 Stochastic simulation of historie seismicity 

Comparing the results of stochastic simulations with observed seismicity is 
essential to assess their validity and to identify any opportunities for further 
improvement. 

Figure 9(a-d) shows such a comparison for the temporal and spatial dis- 
tribution of event numbers and total seismic moments obtained after some 
10,000 earthquake catalogues were simulated for the period April 1995 to 
August 2014. The temporal distribution of seismicity is measured by the 
total number and total seismic moment occurring anywhere within the field 
as it increases with time since the start of the period in April 1995. Likewise 
the spatial distribution is indicated by the total number and total seismic 
moment occurring at any time within the period as it increases with distance 
from the centroid of the observed epicentres. In all cases the spread of sim¬ 
ulation results are represented by the median value and the 95% confidence 
interval around this median value. As expected, cumulative event numbers 
show a symmetrie Poisson distribution about the median and total seismic 
moments show an asymmetrie Pareto sum distribution with the upper confi¬ 
dence bound substantially further from the median than the lower confidence 
bound. 

There is good agreement between the observed and simulated temporal 
distribution of event numbers (Figure 9a), and similarly so for the temporal 
and spatial distributions of total seismic moment. However we do note that 
the median total seismic consistently exceeds the observed values through 
time, and there is also a transient exceedance below the 95% confidence 
bound in 2003. This suggests the possibility of a slight upward bias in the 
model meaning it systematically over-predicts the total seismic moment. One 
way to reduce this bias would be to increase the 6-value as this would lower 
the median total seismic moment without changing the median total event 
numbers. This is worthy of further investigation as it may be possible to 
achieve this improvement whilst maintaining the simulated 6-value within 
the margin of uncertainty around the 6-value estimated from the observed 
earthquakes, b = 1 ± 0.2 (Bourne and Oates, 2012). 

The most significant difference between the observed and simulated earth¬ 
quakes is the spatial spread of event numbers (Figure 9b). The observed 
events are more localised around the centroid of seismicity than the simu¬ 
lated events by 3 ± 1 km. This discrepancy is small compared to the 40 km 
length of the field. This indicates that out of the potential spatial variabil- 
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ity of epicentres within the field (ie. ±20 km) most is explained by the 
activity rate model based on compaction (85%). This does however leave 
some residual variability (15%) still to be modelled, perhaps by improving 
the compaction model through better quantification of uncertainty in the 
model after fitting the geodetic subsidence data, or through understanding 
the extent to which earthquakes are localised on mapped faults. 

Finally, we note that the distribution of times and distances between con- 
secutive events (Figure 9e,f) reveal a consistent and statistically significant 
model discrepancy of over-dispersion. In particular, the observed earthquakes 
show a greater abundance of events within 3-10 days and 1-10 km of each 
other than the model. This suggests that some of the earthquakes may be 
aftershocks triggered by previous earthquakes. 
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Figure 9: Simulation results from the exponential compaction trend Poisson 
process model, (a) The cumulative number of M > 1.5 events observed 
through time between April 1995 and August 2014 as a function of time. (b) 
As (a) except for the horizontal distance from the centroid of the observed 
locations. (c, d) As (a, b) except for total seismic moment, (e, f) The 
inter-event times and distances-squared provide a measure of temporal and 
spatial clustering respectively. The simulated results are shown as the median 
and the 95% conhdence interval. These results were obtained from 10,000 
catalogues simulated according to the exponential compaction trend Poisson 
model and a Standard random location error of 500 m. Model parameter 
estimates were sampled from the relative likelihood distribution shown in 
Figure 7. In this case the simulations are based on independent samples from 
a frequency-magnitude distribution with 6=1, M mm = 1.5, M max = 6.5 
initially, and a Standard magnitude error of 0.2. 
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3 Earthquake clustering 

3.1 Aftershocks and temporal clustering 

Figure 9e indicates the observed earthquakes are significantly more clustered 
in time than the maximum likelihood Poisson point process model. This 
is noticeable by the over abundance of events with inter-event times of less 
than 20 days. Differencing the Poisson model from the observed probabil- 
ity density distribution, i.e. differentiate the black and grey curves shown 
in Figure 9e and differencing them, provides a measure of the inter-event 
temporal trigger function (Figure 10) which provides a reasonable fit with 
Omori’s Law (inverse power-law) for a power-law exponent of 1.45 and a 
characteristic time of 3 days. 



Figure 10: Aftershock probability density as a function of time after the main 
event. This was estimated as the difference between the observed probability 
density and estimates of the probability density for independent events. The 
latter was based on the maximum likelihood estimate for the exponential 
compaction trend Poisson point process. Dashed lines denote the 95% con- 
fidence interval in the observed distribution. For comparison, Omori’s Law 
for aftershocks is shown for p = 1.45 and c = 3 days. 
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3.2 Aftershocks and spatial clustering 

The frequency distribution of inter-event distances (Figures 9f) indicates the 
observed earthquakes exhibit a small yet significant under-dispersion com- 
pared to the Poisson point model of independent event locations. Figure 11 
shows further information about the nature of spatial clustering of aftershock 
epicentres. Events that occur within 3 days of each other are more likely than 
not to be located with 10 km of each other (Figure 11c). Although, there is 
no clear evidence at present for any significant anisotropy in the distribution 
of spatial clustering (Figure 11b). 



Figure 11: Evidence for spatial clustering of aftershocks. (a) The horizontal 
offset vectors between all event pairs within 3 days of each other expressed 
as polar angles in degrees and distance in kilometres. (b) The frequency 
distribution of these offset azimuths. Black solid lines denote the expected 
value and the 95% confidence interval for an isotropic distribution. (c) The 
frequency distribution of these offset distances. 
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3.3 Aftershock productivity 

If we identify potential aftershocks as those events occurring within 3 days 
of a previous event, then there is some evidence that aftershock produc¬ 
tivity (average number of aftershocks per main event) increases with the 
magnitude of the main event (Figure 12a). Although, given the small num¬ 
ber of observed aftershocks this apparent trend remains somewhat uncertain 
and may change as more data are acquired. This trend is consistent with 
a power-law although due to the observed scatter about this trend there 
is some significant uncertainty about the value of the power-law exponent. 
The frequency-magnitude distribution of these aftershocks (Figure 12b) has 
a b -value that is indistinguishable from the b -value for the entire earthquake 
population, where b = 1 ( e.g . Bourne and Oates, 2012). Furthermore, the 
frequency distribution of magnitude differences between the main shock and 
its aftershocks shows most but not all aftershocks are smaller than the mag¬ 
nitude of the main shock. Both of these results are consistent with event 
magnitudes occurring independently of previous event magnitudes - that is 
there is no evidence of aftershock magnitudes depending on the magnitude 
of the main event. 

These observations of significant spatial and temporal correlations be¬ 
tween events but no correlation between event magnitudes are also typical of 
naturally occurring seismicity. Motivated by this similarity, we will now con- 
sider a general model for such inter-event correlations in natural earthquakes 
known as the Epidemie Type Aftershock Sequence model. 


25 



(b) 


10 " 


3 10 1 

cr 

CU 


ïcr 


O Data 

Model: b = 1 


1.5 


2.0 


2.5 3.0 

Magnitude M 


3.5 


4.0 



Figure 12: (a) The average number of M > 1.5 aftershocks per main shock, 
(b) The frequency-magnitude distribution of aftershocks. (c) The frequency 
distribution of magnitude differences between the main shock and an after- 
shock. Aftershocks were identified as events within 3 days of a previous 
event. 
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4 The Epidemie Type Aftershock Sequence 
Model 

Treating the seismological model as a point process with conditional inten- 
sity A, the log-likelihood given the time-ordered sequence of observed events 
(fj,Xj) follows from (9) as 

i = - f f X(t, x)dSdt + log(41) 

J S J i =1 

According to the Epidemie Type Aftershock Models Ogata (1998, 2011), the 
conditional intensity A is expressed as 


i—1 

A(x, t) = X p + ^2 f(U - x * - Xj\ M j) (42) 

3 = 1 

where X p is the intensity function for independent events, Mj is the magnitude 
of the j th event, and t u tj, x,, x. ; are the origin times and locations of the i th 
and j th events respectively. Function ƒ is the aftershock triggering function 
defined as 

ƒ = Kg(t)h(r)e aiM ~ Mo) (43) 

where t is the inter-event time, r is the inter-event distance, M is the event 
magnitude, M 0 is the minimum magnitude, and K, a are parameters of 
the ETAS model. The functions g(t) and h(r) are the probability density 
functions for temporal and spatial triggering defined as: 


9(t) 


(P~ 4) 
c 



(44) 


h(r) 


(9-1) 

7T d 



(45) 


and r 2 = x 2 + y 2 , x = (x,y), c,p are the characteristic time and temporal 
power-law exponent parameters, and d, q are the characteristic area and spa¬ 
tial power-law exponent parameters of the ETAS model. The corresponding 
cumulative probability density functions for temporal and spatial triggering 
follow by integration as 


G(i) = 1 



(46) 
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Similarly, the expected number of aftershocks directly triggered by a single 
event of magnitude M is 

Ke a(M-M 0 )_ ( 48 ) 

Given the physical requirement for the total number of events in any after- 
shock sequence to be positive dehnite and ünite 

0 < Ke a(M ~ Mo) < 1. (49) 

Next, let us consider maximum likelihood estimation of the model parame¬ 
ters. Starting with the simple illustrative example of a homogeneous Poisson 
point process, i.e. constant intensity X p = //, and a magnitude-independent 
ETAS process, i.e. a = 0. The log-likelihood function may be expressed as 

n 

£ = —fjjAT( 1 + K ) + ^ ' log(p + Kgihi). (50) 

i =1 

where A and T are the area and time period of observation and g % = g(ti), 
hi = h(r i '). The first term on the right hand side makes use of Schoenberg’s 
approximation for the integral contribution from the ETAS model that as- 
sumes the trigger function is negligible outside the area and time period of 
observation Schoenberg (2013). 

After differentiation and rearrangement, the following expressions for 
maximum likelihood estimates of /i and K may be obtained 

11 i 

AT(1 + K) = J2 ■ (51) 

i= i h + Kg^i 

fiAT = V —-. (52) 

A + Kg^i 

In the limit of weak event triggering, such that Kg^i <C /( for all i, these 
expressions may be simplified to yield 


A = Vho(gh) 

(53) 

l+K iw 

(54) 

„ Ti 

ho = 

(55) 
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where (gh) is the average value of the triggering function for all observed 
events. Here, po is recognisable as the maximum likelihood estimate for the 
rate of independent events in the limit of no aftershocks. Also notice that 
there is a trade-off between the estimates for p and K\ this is clearly seen 
by eliminating (gh) from the previous expressions to obtain /'dl + K) = Po- 
where p 0 is a constant. 

4.1 Joint parameter estimation 

Joint maximum likelihood estimates for the exponential compaction trend 
Poisson process model ((3 0 , (31) and the ETAS model (K,p,c,q,d,a) were 
obtained numerically using the Nelder and Mead (1965) simplex algorithm 
to minimise the negative log-likelihood expression for the historie seismic- 
ity and reservoir compaction data. The resulting parameters estimates are 
{/3 0 ,3i,K,a,p,c,q,d,a} = {2.3 x KT 8 ,12.8, 0.31,1.45, 3.0,1.9, 5 x 10 6 , 0.6}. 
These were obtained subject to the constraint that c = 3 days since this 
parameter is only weakly constrained by the small number of historie events. 

Figure 13 indicates the solution space according to the relative likelihood 
of the historie data arising from the model. This provides an impression of 
the confidence intervals for each parameter and the null-space due to trade-off 
between some parameters leaving the fit of the model to the data essentially 
unaffected. As expected from the previous discussion, there is a negative 
covariance between {K, (3o\ and {K, (3\\. The parameters {K, a} also exhibit 
a pronounced negative covariance; these are expected to be coupled since the 
expected number of aftershocks is Xe a ^ M ~ M °\ 

The lower right two plots in Figure 13 reveal unbounded null spaces for 
{p, c} and {q,d}. This occurs because although the observed aftershocks 
exhibit significant temporal and spatial clustering there are too few of them 
to uniquely constrain the spatial and temporal triggering functions since 
most aftershocks are observed within just 3 days and 10 km of the main 
event. With continued monitoring it is likely that the future earthquake 
catalogues will contain more aftershocks at greater times and distances from 
the main events to allow more precise estimates of p, c, q, d. 

For now we choose to obtain the maximum likelihood parameter esti¬ 
mates subject to the constraint c = 3 days which is consistent with the 
earlier graphical analysis. Uncertainty in these maximum likelihood values 
is then represented by a list of acceptable parameter combinations computed 
using Monte Carlo sampling of the relative likelihood function. This set of 
parameter combinations is then available to be re-sampled during stochastic 
simulations of earthquake catalogues. 

Finally we note that any attempt to de-cluster the earthquake catalogue in 
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Figure 13: A selection of slices through the relative likelihood distribution 
around the maximum likelihood parameter estimates. White dots denote the 
maximum likelihood parameter estimates for /3 0 , (31, K,p, q, d subject to the 
constraint c = 3 days. 


order to obtained unbiased estimates of Poisson model parameters is fraught 
with ambiguity. The intensity distributions of independent events and after- 
shocks are frequently too similar to allow any single event to be confidently 
labelled as an independent event or a triggered event (Figure 14). This dif- 
ficulty is avoided by joint estimation of the independent and triggered event 
process parameters as previously described, and then joint simulation of main 
shocks and aftershocks. 
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Year 


Figure 14: Intensity functions for the background initiation and aftershock 
triggering rates for the observed events. These functions are based on the 
joint maximum likelihood estimates obtained for the exponential compaction 
trend model with ETAS. 
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4.2 Simulation of historie events 

As before, we now seek to assess how the inclusion of an epidemie type af- 
tershock sequence model improves the performance relative to the observed 
seismicity (Figure 15). Notably the observed temporal and spatial cluster of 
consecutive events is now well-explained by the model within tight confidence 
bounds (Figure 15e,f). Relative to the previous activity rate model without 
aftershocks, the confidence bound on event numbers has also widened, pri- 
marily due to the upper bound increasing (Figure 15a,b). This departure 




Year Horizontal distance trom centroid [km] 






Figure 15: As Figure 9, except for the simulation results from the seismolog- 
ical model based on an exponential compaction trend Poisson process and 
an ETAS process. These simulations were based on Monte Carlo model pa¬ 
rameter estimates. The dark grey line denotes the median simulation and 
the light grey band denotes the 95% confidence interval about the median. 
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from a Poisson distribution is expected as the events are no longer entirely 
independent of each other and some tendency towards event clustering will 
necessarily increase the variability in event numbers. The discrepancy in 
spatial localisation between the observed events and the median simulation 
remains unchanged but due to the increased variability, its statistical signif- 
icance is slightly reduced to 3 ± 2 km (Figure 15b). 

The slight systematic bias in total seismic moment remains as before, 
with the median simulation always exceeding the observed values from 1996 
onwards. Again, as the bias does not appear to change with time (ie. con¬ 
stant gap between the grey and black lines), one clear opportunity to reduce 
this bias is to investigate increasing the 6-vahie slightly above its maximum 
likelihood estimate of b = 1.0 ± 0.2 but still within its confidence interval. 

4.2.1 Map variability 

Figure 16 maps the observed and the median simulated event number density 
distributions and the differences between these two. Due to the relatively 
small number of events in the catalogue, there are considerable stochastic 
fluctuations in the event densities found in different simulated catalogues. 
From the confidence bound shown in Figure 15b it is apparent that 95% 
of this variability remains within ±30% of the median value. The largest 
event number densities found on these maps are 1 km -2 (observed) and 
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Figure 16: Maps of the observed and simulated number density and the 
residuals between them. Events densities were computed using the Gaussian 
kernei method. Simulation results were obtained using the full probability 
distribution of parameter estimates for the exponential compaction trend 
activity rates with epidemie type aftershock sequences. 
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0.6 ± 0.2 km 2 (median simulated ± 95% confidence interval). This is a 
statistically significant difference and shows up as the largest red region in 
the residual map. However, as previously seen in Figure 15b this discrepancy 
is equivalent to a 3 ± 2 km shift in the simulated epicentres away from the 
centroid of observed epicentres. 

There are two other regions of notable residuals. The largest is an area 
of negative residuals (blue) in a 10 km wide strip located just inside the field 
boundary and extending 25 km from the northern limit of the field to its 
eastern limit. The smallest is an area of positive residuals (red) located along 
the south-west field boundary. Both of these are also statistically significant 
although they relate to much smaller discrepancies. 

We note the map of residuals between surface subsidence observed by 
geodetic levelling networks and computed from any of the existing com- 
paction models (pers. comm. Biermann, 2014) shows a quite similar spatial 
pattern of residuals. These compactions models appear to systematically 
over-predict the observed subsidence along the north to north-east boundary 
of the field and under-predict subsidence along the south-west field bound¬ 
ary. This presents a clear opportunity to make a single update to these 
compaction models to improve the fit to historie subsidence and seismicity. 

A further potential source of error in the existing compaction models is 
that they are possibly too smooth on length-scales less than 3 km. This 
is generally likely as surface subsidence is largely insensitive to variability 
in reservoir compaction on these length-scales because the 3 km thick over- 
burden effectively filters this information out before it reaches the surface. 
This is a fundamental limit in the use of subsidence measurements to infer 
reservoir compaction as it takes a order of magnitude improvement in the 
signal-to-noise of subsidence measurements. However, we do recognise there 
may be other opportunities to improve the lateral resolution of reservoir com¬ 
paction, such as using measurements of both horizontal and vertical surface 
displacements, or optimising the compaction model to simultaneously match 
the observed surface displacements and reservoir seismicity. 

4.2.2 Annual variability 

To better appreciate the year-to-year variability in the Acticity Rate model, 
Figure 17 shows how the annual event numbers and annual total seismic 
moments compare to this model between 1995 and 2014 and forecasts for 
future seismicity from 2014 to 2020 based on the current production plan. In 
both cases, the median simulation smoothly follows the trend of increasing 
annual seismicity with time, and the confidence bounds bracket the consider- 
able year-to-year variability in observed seismicity. This is achieved without 
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Figure 17: Annual number of M > 1.5 events, and the annual total seismic 
moment based on simulations of the Activity Rate model with aftershocks 
from 1996 to 2020 as compared to the observed seismicity from 1996 to 2014. 
The dark grey line denotes the median simulation and the light grey band 
denotes the 95% confidence interval about the median. 

excessive large uncertainty bounds. The one slight and transient exception is 
due to the total seismic moment observed in 2001 just outside the lower con¬ 
fidence bound. This likely reflects the systematic bias in the model causing 
a tendency towards slight over prediction, which as stated before might be 
fixed by a slight increase in the 6-value used for simulating event magnitudes. 


4.2.3 Strain partitioning and 6-values 

As an additional check, we note the Activity Rate model also reproduces 
the observed exponential trend in strain partitioning with compaction (Fig¬ 
ure 18a). Strain partitioning is measured as the fraction of reservoir strain 
accommodated by seismogenic fault slip and so depends on the total seismic 
moment per unit reservoir volume change. The Activity Rate model was not 
calibrated on this trend but independently reproduces it. This happens be- 
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Figure 18: (a) Strain partitioning and (b) maximum likelihood estimates of 
6-value as functions of reservoir compaction. Observed seismicity is from 
April 1995 to August 2014. Simulations are based on the activity rate model 
with aftershocks shown in Figure 15. The dark grey line denotes the median 
simulation and the light grey band denotes the 95% confidence interval about 
the median. 


cause event numbers per unit reservoir volume change increase exponentially 
with compaction and more events means more chance of a larger magnitude 
event and hence a larger total seismic moment. 

Finally, we note the constant 6-value assumed by seismological model is 
reproduced by the simulations and the apparent trend of decreasing 6-value 
with increase compaction is hard to substantiate (Figure 18b). This is due 
to larger uncertainties in the estimated 6-values that are inherent with the 
small sample size (N = 50) that must be used in any attempt at present to 
discern a trend with compaction. 

4.2.4 Sensitivity to aftershock productivity 

Recall that estimation of the ETAS a-value was largely uncertain due to the 
current limited range of magnitudes observed (Figure 13, top right). To in- 
vestigate the influence of this epistemic uncertainty about the magnitude de- 
pendence of aftershock productivity, we ran additional simulations for a = 0 
and a = 1.8 whilst all other parameters were fixed to their maximum likeli¬ 
hood estimates. Figure 19 shows the that the median model remains a good 
fit to the observed number of events through time. The most notable differ- 
ence is the much larger upper 95% confidence bound for a = 1.8. This makes 
sense for the following reason. For o = 0, event numbers do not depend on 
previous event magnitudes and the simulated event numbers are most like a 
Poisson distribution. For larger a-values the total number of events depends 
on previous event magnitudes that follow a Pareto distribution with its heavy 
tail above the median. Consequently, the ETAS a-parameter acts to propa- 
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Figure 19: The estimated confidence interval for event numbers depends on 
the ETAS a parameter: (a) a — 0, (b) a = 1.8. Larger a-values means a 
larger upper bound to the confidence interval due to the larger number of 
aftershocks associated with larger magnitude events. Confidence bounds for 
a > 0 become increasing asymmetrie about the median due to this coupling 
with event magnitudes that follow a Pareto frequency distribution. The dark 
grey line denotes the median simulation and the light grey band denotes the 
95% confidence interval about the median. 

gate the Pareto magnitude distribution into the event number distribution. 
This makes the ETAS a-parameter a clear target for reducing epistemic un- 
certainty about future seismicity and in particular obtaining a reliable upper 
bound to activity rate forecasts such as shown in Figure 17. 


5 Do earthquakes cluster on mapped faults? 

An earthquake typically envolves slip on a fault which, more likely than not, 
will exploit and pre-existing fault as a surface of least resistance to slip. 
However, our knowledge of both earthquakes and faults within the Gronin¬ 
gen Field is limited. Due to the current sparse nature of the earthquake 
monitoring network, earthquake depths are essentially unknown and their 
epicentres are uncertainty to within a Standard random measurement error 
of 500 m (pers. comm. Doost, 2013). Due to typical resolution limits of the 
seismic reflection image pre-existing faults with throws less than 15 m can- 
not be reliably identified and mapped. Despite the considerable variability in 
maximum fault throw to fault length ratio (e.g. Kim and Sanderson, 2005), 
the typical value of 10 -2 suggests faults shorter than 1500 m will be too small 
to map. A 1500 m long fault would be too small to reliably map but may 
still be large enough to host a magnitude 5 earthquake with a 10 MPa stress 
drop (Bourne and Oates, 2013, Table 2). 

Although it is appealing to speculate about earthquakes clustering on 
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mapped faults - this need not be the case. Indeed, all the observed earth- 
quakes, in principle, could have occurred on pre-existing faults that are too 
small to be mapped. The only way to decide is to compare the distribu- 
tion of earthquake epicentres and mapped faults to test the hypothesis that 
earthquakes possess some tendency to preferentially cluster around mapped 
faults. 

To begin, let us first consider which fault segments might be associated 
with most earthquakes. One simple way to do this is the move each earth¬ 
quake onto the closest mapped fault segment and then count the number 
and seismic moment density along these fault segments (Figure 20). This re- 
veals that every mapped fault within the general vicinity of the earthquakes 
is highlighted. There is no evidence that faults of any particular strike ex- 
perience preferential seismicity. This makes sense that the uncertainty in 
epicentres is similar to the faults spacing (Figure 3). 

Next, let us look in more detail at the distribution of offset distances 
between epicentres and the closest mapped fault to see if there is any weak 
evidence for earthquakes clustering closer to mapped faults rather than ap- 
pearing with equal likelihood anywhere between mapped faults. The results 
of this analysis are shown in Figure 21. This reveals that M > 1.5 events 
exhibit a slight but statistically significant bias towards mapped faults when 
compared to stochastic simulations conditioned on the compaction but with 
no knowledge of the mapped faults themselves. This shows that half of the 
earthquakes are observed within 200 m of a mapped fault whereas for the 
simulated earthquakes this distance is 300 ±50 m. This suggests the observed 
earthquakes are located at least 100 m closer to mapped faults than would 
be expected on the basis of pure chance. When we repeat this analysis for 
M > 2, 2.5, 3 there is a similar discrepancy but due to the smaller number of 
events available the confidence interval on the simulations are larger enough 
to explain these differences according to chance alone. As M < 3.5 earth¬ 
quakes make no significant contribution to the seismic hazard Bourne et al. 
(2013) or risk the current evidence for clustering of M < 2.5 events around 
mapped faults does not need to be included in the seismological model. 

Looking at this issue from another direction, we also tested the possi- 
bility that every observed event actually originated on a mapped fault but 
was miss-located due to the 500 m Standard random error in epicentre de- 
termination. These two effects were included in the seismological model by 
simulating events as before, then relocating them on the closest mapped fault, 
and then relocating a second time by measurement error that was sampled 
from an isotropic bi-variant Gaussian distribution with a Standard deviation 
of 500 m. This procedure happens to yield an excellent match with the ob¬ 
served distribution of epicentre to mapped fault distances for all magnitude 
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thresholds (Figure 22). 

All these results mean that the current distribution of earthquakes could 
have originated from the distribution of mapped faults but we lack the pre¬ 
cision to confidently reject the alternative hypothesis that mapped faults do 
not influence earthquake locations, at least those large enough to influence 
the seismic hazard and risk assessment. The upgraded earthquake monitor- 
ing network will provide a better opportunity to decisively resolve this issue. 
However, if all earthquakes are discovered to occur on mapped faults this is 
still unlikely to materially affect the seismic hazard and risk because it will 
typically require moving the simulated earthquake epicentres by only 300 m 
which will have little influence on the resulting ground motion distributions 
3 km above the reservoir. 
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Figure 20: Estimates for the event number density (left) and the seismic 
moment density (right) obtained from the M > 1.5 events observed between 
April 1995 and December 2013 conditional on every event originating on a 
mapped fault. 
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Figure 21: The fraction of epicentres located within a given distance of a 
mapped fault for all events of at least magnitude (a) 1.5, (b) 2.0, (c) 2.5, 
and (d) 3.0. For comparison, simulation results are shown as the median and 
95% conhdence interval according to the strain partitioning model with no 
clustering on mapped faults. 
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Figure 22: As Figure 21, except the simulations include clustering on mapped 
faults with a Standard epicentral location error of 500 m. 
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6 Conclusions 


Between 1960 and 2014 more than 2000 x 10 9 m -3 of hydrocarbon gas has 
been produced from the Groningen Field. This has induced a more-or-less 
uniform reservoir pressure depletion exceeding 25 MPa, a maximum surface 
subsidence of at least 350 mm, and more than 235 M L > 1.5 earthquakes with 
a historie maximum of Ml = 3.6 in August 2012. The induced seismicity is 
a cause of considerable concern due to the possibility of future earthquake 
ground motions causing damage to buildings and also possibly injury to the 
occupants. 

Probabilistic seismic risk assessment is a well-established method for char- 
acterising the size of this risk as well as its geographical extent and distri- 
bution across different building typologies. If necessary, options to mitigate 
this seismic risk include reservoir pressure maintenance, regulating future 
gas production rates, retro-fitting selected existing buildings to improve their 
structural resilience and ensuring new buildings are designed for earthquake 
resilience. 

A necessary part of probabilistic risk assessment is a seismological model 
that provides a means of forecasting the probability distribution of future 
earthquakes in terms of their numbers, locations, magnitudes, and focal 
mechanisms. The first seismological model developed for the Groningen 
Field was the Strain Partitioning model (Bourne and Oates, 2013; Bourne 
et ah, 2014b). To assess and potentially reduce epistemic uncertainties in this 
model, an alternative model has been developed - the Activity Rate model. 
Both models exploit the observed conformance between historical subsidence 
and seismicity. The geomechanical basis for this is that the induced earth¬ 
quakes play a role in accommodating the induced deformations within and 
around the reservoir due to reservoir compaction. 

There are two key measures of seismicity in relation to this deformation, 
the total seismic moment per unit reservoir volume decrease and the total 
number of events per unit reservoir volume decrease. Both measures exhibit 
clear exponential-like trends with reservoir compaction. The Activity Rate 
model succeeds in representing this trend more precisely than the Strain 
Partitioning model. Moreover it benefits from more reliable parameter esti- 
mates using formal maximum likelihood methods and also avoids the need 
for expert judgement as the data do not need to be arranged in bins. This 
contrasts with the Strain Partitioning model that uses an approximation to 
the Pareto sum distribution and requires a choice of bin size to estimate 
parameters linking seismicity to compaction. 

The evidence for aftershocks within the historie earthquake catalogue 
is not obvious with casual inspection due to the small number of observed 
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events, but simulation-based analysis reveals statistically significant evidence 
for their presence. Furthermore, the temporal, spatial and magnitude trig- 
ger effects may be quantified in accordance with the Epidemie Type After- 
shock Sequence model. Joint maximum likelihood estimation of the com- 
paction and aftershock model parameters leads to an Activity Rate model 
that matches almost all aspects of the observed seismicity. 

Two minor exceptions offer opportunities for further improvement to the 
Activity Rate model. First, the model has a slight tendency to over-predict 
the total seismic moments. This will likely be resolved by including the 
frequency-magnitude ö-value parameter in the joint maximum likelihood es¬ 
timation scheme. Second, the model has a slight tendency to under-predict 
the spatial localisation of event numbers (< 3 km) within the region of great- 
est compaction and seismicity. This will likely be resolved by updating the 
reservoir compaction model to improve its fit to geodetic observations of sur- 
face displacements as well as quantification of uncertainty in the estimated 
values of the compaction model parameters. 
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7 Recommendations for further work 


We identify the following opportunities for further improvements to the seis- 
mological model: 

• Investigate updating the existing compaction models to improve the fit 
to geodetic measurements of surface displacements and to improve the 
performance of the seismological model. 

• Quantify the influence of compaction uncertainty on the seismological 
model and incorporate these uncertainties into the existing work flow 
for Probabilistic Seismic Hazard and Risk Assessment. 

• Investigate the utility of alternative functional forms for the relation- 
ship between the rate of event nucleation and the rate of reservoir 
compaction such as an inverse power-law. 

• Investigate the possibility of 6-valuc decreasing with compaction, or 
otherwise revising the 6-value to avoid the slight tendency to over- 
predict seismic moments. Also assess the sensitivity to uncertainty 
in 6-value, possibly through including 6-value in the joint parameter 
estimation process. 

• Investigate the use of Bayesian inference (using MCMC) as a more 
computationally efficiënt alternative to the current maximum likeli- 
hood Monte Carlo approach for parameter estimation and stochastic 
simulation. 
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Appendices 

A The strain-partitioning model 

For comparison with the activity rate model, Figures 23 and 24 show the 
performance of the strain-partitioning model over the period of historie seis- 
micity from April 1995 to August 2014. These are quite similar in terms 
of temporal and spatial distribution of tot al seismic moments. The median 
event numbers are also similarly distributed. However, the most notable 
difference is that the upper bound of the 95% confidence interval for event 
numbers is significantly larger. This reflects the strong dependence on the 
Pareto sum distribntion as event numbers are determined by the modelling 
requirement to achieve a certain tot al seismic moment. This conld be inter- 
preted as a member of the activity rate model where the ETAS a-parameter 
becomes large. 



(a) 


(b) 



Horizontal distance from centroid [km] 






Figure 23: Comparison of simulated and observed data for the strain parti- 
tioning model. 
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Figure 24: As Figure 16, except for the strain partitioning model. 
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